Context
Aaron Penne submitted a beautiful visualization of same-sex marriage laws to Reddit’s DataViz Battle in February 2018. He used violin plots, which are normally used to show the distribution of continuous data, to produce a smooth animation of categorical data. I really liked this idea and decided to visualize Game of Thrones screen time for the most famous characters. Game of Thrones data sets have been mined to death, but this is still an interesting visualization.
Data processing
Most Game of Thrones data sets aggregate data by season, but I wanted a more detailed data set because an animation is smoother when the number of frames (one plot per data point) is higher. I found a JSON data set created by Jeffrey Lancaster that has screen time per episode for every single character. He painstakingly created the data set, even manually watching each episode and using scratch paper to note down details! Please check out his blogs.
Shoutout to Gaurav Modi for helping me with the initial data parsing. I have detailed the entire process in the Python notebook, which involves loading the data, taking time differences and writing the result to a CSV for later use.
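To give a rough idea of what "taking time differences" could look like, here is a minimal sketch. It assumes the JSON lists scenes per episode with start and end timestamps and the characters present; the field names (`sceneStart`, `sceneEnd`, `characters`, and so on) and the sample values are hypothetical and only there for illustration, so the real data set and the notebook may differ.

```python
import csv
import io
from collections import defaultdict

# Hypothetical input shape; field names and values are assumptions for illustration.
data = {
    "episodes": [
        {
            "seasonNum": 1,
            "episodeNum": 1,
            "scenes": [
                {"sceneStart": "0:00:40", "sceneEnd": "0:01:45",
                 "characters": [{"name": "Jon Snow"}]},
                {"sceneStart": "0:01:45", "sceneEnd": "0:02:15",
                 "characters": [{"name": "Jon Snow"}, {"name": "Arya Stark"}]},
            ],
        }
    ]
}


def to_seconds(timestamp):
    """Convert an 'H:MM:SS' string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def screen_time_rows(data):
    """Return (season, episode, character, seconds) rows, one per character per episode."""
    rows = []
    for episode in data["episodes"]:
        totals = defaultdict(int)
        for scene in episode["scenes"]:
            # The time difference between scene end and start is the scene length.
            duration = to_seconds(scene["sceneEnd"]) - to_seconds(scene["sceneStart"])
            for character in scene["characters"]:
                totals[character["name"]] += duration
        for name, seconds in sorted(totals.items()):
            rows.append((episode["seasonNum"], episode["episodeNum"], name, seconds))
    return rows


rows = screen_time_rows(data)

# Write to an in-memory buffer here; a real script would open a file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["season", "episode", "character", "seconds"])
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Writing one row per character per episode keeps the CSV easy to filter later for just the characters of interest.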
Data Visualization
The most challenging part was converting the categorical data into a format that the violin plot could plot. I filtered for only the 4 most famous characters: Arya, Jon, Daenerys (Dany) and Tyrion. As the data was at the episode and season level, with total seconds summed up, it had to be turned into a distribution. I coded the characters as 1, 2, 3 and 4 respectively, and then generated one copy of a character's code for each second of that character's screen time (a 1 for each second of Arya, for example). To elaborate, if Arya had 40 seconds in episode 1, Jon had 30, Tyrion had 20 and Dany had 0, then the data for S01E01 would look like [1,1,1...1,2,2...2,4,4...4]. This list has 40 1’s, 30 2’s, 0 3’s and 20 4’s. This can now easily be plotted as a violin plot.
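The encoding described above can be sketched in a few lines. This is only an illustration of the idea, not the notebook's code: the `CHARACTER_CODES` mapping and the `expand_to_samples` helper are hypothetical names, and the numbers are the ones from the S01E01 example.

```python
# Hypothetical mapping, following the order the characters are listed in the text.
CHARACTER_CODES = {"Arya": 1, "Jon": 2, "Daenerys": 3, "Tyrion": 4}


def expand_to_samples(screen_seconds, codes=CHARACTER_CODES):
    """Repeat each character's code once per second of screen time."""
    samples = []
    for name, code in codes.items():
        # A character with no screen time contributes nothing to the list.
        samples.extend([code] * int(screen_seconds.get(name, 0)))
    return samples


# The worked example from the text: seconds of screen time in one episode.
episode = {"Arya": 40, "Jon": 30, "Tyrion": 20, "Daenerys": 0}
samples = expand_to_samples(episode)
```

The resulting list of integers is what gets handed to a violin plot function, for example matplotlib's `Axes.violinplot`, which then smooths the four discrete values into a continuous-looking shape.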
I then generated a distribution for each episode and stitched them into a GIF using Aaron’s code (easier said than done). To make the plot smoother, I used cumulative screen time.
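As a rough sketch of the cumulative step, assuming "cumulative" means a running total of each character's seconds across episodes: each frame then uses the totals up to and including that episode. The per-episode numbers, the `CODES` mapping and the helper names below are made up for illustration.

```python
from itertools import accumulate

# Hypothetical character codes and per-episode screen time (seconds), in episode order.
CODES = {"Arya": 1, "Jon": 2, "Daenerys": 3, "Tyrion": 4}
per_episode = {
    "Arya": [40, 10, 0],
    "Jon": [30, 0, 25],
    "Daenerys": [0, 15, 5],
    "Tyrion": [20, 20, 20],
}


def cumulative_totals(per_episode_seconds):
    """Turn per-episode seconds into running totals for each character."""
    return {name: list(accumulate(seconds))
            for name, seconds in per_episode_seconds.items()}


def frame_samples(totals, frame_index, codes=CODES):
    """Build the list of character codes for one animation frame."""
    samples = []
    for name, code in codes.items():
        samples.extend([code] * totals[name][frame_index])
    return samples


totals = cumulative_totals(per_episode)
# One list of samples per episode; each would become one frame of the GIF.
frames = [frame_samples(totals, i) for i in range(3)]
```

Because running totals never decrease, the shape in each frame only grows relative to the previous one, which avoids the jumps that raw per-episode numbers would cause when a character is absent from an episode.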